Welcome to Python for Machine Learning!

Lecture recording - Python for ML, Session 1

In [261]:
from IPython.display import IFrame
IFrame("https://drive.google.com/file/d/1I3D9MangVoBvsyede4e3SRC_h5HbiyYh/preview", width="640", height="480")
Out[261]:

My name is Ani Kannal

The first language I learnt was Fortran

I was very good at C, C++, Jess, Lisp, Prolog

I was mildly good at Java

I hadn't coded in many years and I was missing it. So I tried again and it was painful!

I had a big realization ...

  • I am a lazy coder!
  • I like the fun bits of coding but I hate getting stuck
  • I want to do things fast
  • I like quick feedback, not spending hours to realize I was doing something wrong
  • I would rather focus on my problem

So .... Python!

  • Simple programming syntax. Readable code.
    • Easy to learn and use
  • Versatile and popular
    • Can solve varied problems, Can be a great foundation for a career
  • Extensive community support
    • Forums, open-source libraries, frameworks, and tools (I don't have to write everything myself)
  • Support for Data Science and Machine Learning
    • I can start experimenting and doing projects immediately

A few rules for the road -

  • Participate or you will not learn
    • Work with me and do your assignments
    • Don't just watch and leave, this is not a movie!
      • Go watch Netflix if you want to chill!
  • Hands-on only! At xcelerator we learn by doing.
  • Don't be scared to try
    • Experiment or you will not learn
    • You will not learn everything in class
    • Keep playing to learn more
  • Share what you learn. Help others.

Anaconda

Jupyter Notebook

xcelerator.ninja

GitHub

What will you learn?

  • Python fundamentals
    • Data types and Variables
    • Comments
    • Operators
    • Conditional Statements and loops
    • Functions
    • Exception Handling
    • File Handling
    • Lists, Tuples and Dictionaries
    • Packages and Libraries
  • Regular Expressions and Formatting
  • Machine Learning using Python
    • Supervised Learning using Python libraries
    • Unsupervised Learning using Python

Tell us a bit about you...

Let's start with Python

Lecture recording - Python for ML, Session 2

In [260]:
from IPython.display import IFrame
IFrame("https://drive.google.com/file/d/1oiPcoXac54AeuMgNgfL_mEkizIexBFT-/preview", width="640", height="480")
Out[260]:
In [12]:
# You can use python like a calculator
# What are the different operators?

2+3
4*5

# 2, 3, 4 and 5 are operands; + and * are operators

#complex expressions
7*2-8**2/4

#Some more expressions
x = 5
x = 1 + 2 * 3 - 4 / 5 ** 6
print(x)

#How about some brackets?
#x = 1 + (2 * 3) - 4 / (5 ** 6) #A
#x = (1 + 2) * 3 - 4 / (5 ** 6) #B
#x = (1 + 2 * 3) - 4 / (5 ** 6) #C
#print(x)


#Other operators
# ** (power) and % (remainder)
# / vs // - / is true division (float result), // is floor division

2**2

9%2

7//2
6.999744
Out[12]:
3
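
The expressions above mix several operators, so here is a small sketch of how Python orders them. The variable names are made up for illustration; the rules shown (power first, then multiplication and division, then addition and subtraction, with brackets overriding everything) are standard Python behaviour.

```python
# Precedence: ** binds tighter than * and /, which bind tighter than + and -
a = 7 * 2 - 8 ** 2 / 4          # 14 - (64 / 4) = -2.0
b = (1 + 2) * 3 - 4 / 5 ** 6    # brackets force the addition to happen first

true_div = 7 / 2      # 3.5, / always gives a float
floor_div = 7 // 2    # 3, // rounds down to a whole number
remainder = 7 % 2     # 1
neg_floor = -7 // 2   # -4, "rounds down" means toward negative infinity

print(a, b, true_div, floor_div, remainder, neg_floor)
```

When in doubt, add brackets: they cost nothing and make the intent obvious to the next reader.
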
In [18]:
# Comments start with a hash 

# what is a variable?

# You can define different type of variables

x = 2
name = "Python"

# print(name)
# print(type(name))
# price = 4.5

name = 73
# print(name)
# print(type(name))

#To print a value use print
#print(x)
#print(new_str)
#print(price)

#To know what kind of data a variable has use type
# print(x)
type(x)
#help(x)
# print(type(name))
# print(type(price))
# print(type(print))

# #data types in python
# #int, float, str, bool

#print(type(True))

# # Use snake_case to name variables
# # Names can't start with a number but can end with one

y = x
y = 4
print(id(x))
print(id(y))

# 2**4
4553413808
4553413872
In [3]:
#Operate upon these variables
# y = x + 4
# print(y)

# name = "Python"
# fullname = name + " Language"
# print(fullname)
price = 100
x = 2
amount = price*x
print(amount)
200
In [8]:
#More examples - variables in python
# h = 5
# w = 2
# a = h * w
# print(a)

# Use meaningful variable names
# height = 60
# width = 40
# area_of_rectangle = height * width
# print(area_of_rectangle)

# Resolving variable values
x = 2
x = 4   *   x   *   (  1   -   x  )
print(x)

print(x**2 + 4 * 20)
-8
144
In [10]:
#Operators can behave differently on different data types
print(3+4)
print("hi " + "ani!")

print(4 + "I live in house number") #will this work? If yes, why? If not, why not?
7
hi ani!
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-03ffd076f92c> in <module>
      3 print("hi " + "ani!")
      4 
----> 5 print(4 + "I live in house number") #will this work? If yes, why? If not, why not?

TypeError: unsupported operand type(s) for +: 'int' and 'str'
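
The TypeError above goes away once both sides of + have the same type. A minimal sketch of the usual ways to do that follows; `house_number`, `message`, `floor` and `price` are illustrative names, not from the lecture.

```python
house_number = 4

# Option 1: convert the number to a string, then concatenate
print("I live in house number " + str(house_number))

# Option 2: let print() join the pieces (it adds a space between them)
print("I live in house number", house_number)

# Option 3: an f-string converts the value for you
message = f"I live in house number {house_number}"

# Going the other way: text to numbers
floor = int("3") + 1        # 4
price = float("4.5") * 2    # 9.0
```
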
In [47]:
#Converting types
#Convert to int
# inp = input('Indian floor? ')
# usf = int(inp) + 1
# print ('US floor ', usf)

def func_temperature():
    inp = input('Temperature in Celsius ')

    #print(type(inp))
    # inp.isalpha()

    # isalpha() is only a rough check: it catches input made up entirely of letters
    if inp.isalpha():
        far = "Error"
    else:
        far = float(inp)*9/5 + 32   # F = C * 9/5 + 32

    print ('Temperature in Fahrenheit '+ str(far))
    
func_temperature()


def func_area(radius):
    pi = 3.14
    area = pi*radius**2
    return area

func_area(3)


#function - encapsulates code
#class - encapsulates code and data

#help(str)
dir(str)
Out[47]:
['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']
In [3]:
#internal functions
string = "xcelerator"
print(len(string))
max(string)
min(string)

print(string)
type(string)
10
xcelerator
Out[3]:
str
In [ ]:
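## Overriding a built-in function
# Python has no function overloading: defining your own len() replaces (shadows) the built-in one in this notebook.

def len(x):
    print(x)
print(len(string))   # prints the string, then None, because this len() returns nothing

len(5)
# Run "del len" to get the built-in len() back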
In [15]:
#Read from user
nam = input('Who are you? ')
print ('Welcome ', nam, end = '\t')
print ('good to see you')

#help(print)
#read(print)

# a
# b
# c
Who are you? ani
Welcome  ani	good to see you
Out[15]:
'\nancn\nadad\nads\n'
In [54]:
#using modules and packages

#import math

#from math import *

#dir(math)

from math import pi

area = pi * 7**2

print(area)
153.93804002589985

https://wiki.python.org/moin/UsefulModules#Scientific

We will create our own module later.

What have we done till now?

  • Operators and operands
  • Writing expressions
  • Variables, variable types, and typecasting
  • Using internal functions
    • printing to the screen
    • reading input from user
  • Function overriding
  • How to ask for help from Python? dir() and help()

A few questions/comments -

Great response to the quizzes and assignments overall. I hope you are all doing them.

  1. Do you have your Python environment set up?
  2. Are you practicing?
  3. Do you have access to xcelerator? - content, exercises, club
  4. Have you created your own GitHub account?
  5. Have you forked my GitHub repo yet?

Agenda for today

  • Exception handling
  • Conditional execution
  • Loops in Python - while and for
  • Defining functions - code encapsulation
  • Classes and Objects - code and data encapsulation
  • Strings
  • Lists
  • Tuples

Lecture recording - Python for ML, Session 3

In [259]:
from IPython.display import IFrame
IFrame("https://drive.google.com/file/d/19dQn0Sg4IzCNjyzBkXg5mGt2h20itcrI/preview", width="640", height="480")
Out[259]:
In [48]:
# Exception handling

try:
    height = float(input("Enter Height "))
    width = float(input("Enter Width "))
    area = height*width
    print("Area:" + str(area))
except ValueError:   # float() raises ValueError when the text is not a number
    print("please enter a float or int value")
    # give the user one more chance
    try: 
        height = float(input("Enter Height "))
        width = float(input("Enter Width "))
        area = height*width
        print("Area:",area)
    except ValueError:
        print("please enter a float or int value")
Enter Height abc
please enter a float or int value
Enter Height abc
please enter a float or int value
In [51]:
#Boolean values

1 == 1
1 > 2

#type(1==1)
#type(True)           # note the capital T: 'true' is not defined in Python

# x = 3
# y = 4
# x == y               # x is equal to y
# x != y               # x is not equal to y
# x > y                # x is greater than y
# x < y                # x is less than y
# x >= y               # x is greater than or equal to y
# x <= y               # x is less than or equal to y
# x is y               # x and y are the same object (identity, not equality)
# x is not y           # x and y are not the same object


#Let's do a couple of examples of conditional execution
Out[51]:
False

In [55]:
#Let's write these conditional execution statements together.

x = input('X')
y = input('Y')

if x == y :
    print('equal')
else:
    if x < y:
        print ('less')
    else:
        print ('greater')
X4
Y5
less
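
One thing to watch in the cell above: input() returns text, so x < y compares strings character by character. It happens to work for 4 and 5, but not for 10 and 9. A small sketch of the pitfall and one way around it follows; the `compare` helper and the hard-coded strings are illustrative stand-ins for what a user would type.

```python
# These strings stand in for values typed at an input() prompt
x_text, y_text = "10", "9"

text_says_less = x_text < y_text                     # True: the character "1" sorts before "9"
number_says_less = float(x_text) < float(y_text)     # False: 10 is not less than 9

def compare(x, y):
    """Return 'equal', 'less' or 'greater' for two numbers."""
    if x == y:
        return 'equal'
    elif x < y:          # elif avoids nesting an if inside the else
        return 'less'
    else:
        return 'greater'

print(compare(float(x_text), float(y_text)))
```
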
In [61]:
#Indefinite loops - while
i = 0
while i < 5:
    print(i)
    i = i + 1
    if (i==3):
        break


#Definite loops - for

# for i in range(5):
#     print(i)
#     i = 1

#     print(id(i))
#     i=2
#     print(id(i))

# Continue and Break
0
1
2
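
Here is a short sketch of the definite loop with continue and break, to go with the while loop above. The list names `visited` and `seen` are made up for the example.

```python
visited = []
for i in range(10):
    if i % 2 == 1:
        continue        # skip odd numbers and move on to the next value
    if i > 6:
        break           # leave the loop entirely
    visited.append(i)

print(visited)

# Rebinding the loop variable does not change what range() hands out next
seen = []
for i in range(3):
    seen.append(i)
    i = 100

print(seen)
```
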
In [63]:
while True:
    try:
        height = float(input("Enter Height "))
        width = float(input("Enter Width "))
        break   # both values converted, leave the loop
    except ValueError:
        print("invalid input format, please enter numbers")

area = height*width

print("Area:" + str(area))
Enter Height abc
invalid input format, please enter numbers
Enter Height 5
Enter Width 5
Area:25.0
In [86]:
# Defining your own functions
# def is a keyword that indicates that this is a function definition. The name of the function is print_name.
# You can't use a keyword as the name of a function, and you should avoid having a variable and a function with the same name.
# The empty parentheses after the name indicate that this function doesn't take any arguments. 
# The first line of the function definition is called the header; the rest is called the body. 
# The header has to end with a colon and the body has to be indented. 
# By convention, the indentation is always four spaces. The body can contain any number of statements.

# def print_name():
#     name = input('What is your name? ')        
#     print("My name is", name)

# print_name()

# print(type(print_name))

#what are void functions?
#void function vs. return

import math

def new_function(num):
    mult_pi = num*math.pi
    return mult_pi

def void_function(num1, num2):
    var = num1*num2
#    print(var)
    output = new_function(var)
#    print(output)

# out_void = void_function(2,2)

output = print("abc")
print(output)

#type(new_function)

var = input("input value ")
type(var)


# mac - cmd + /
# win - ctrl + /

    

    
abc
None
input value abc
Out[86]:
str
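
The cell above shows print() handing back None. The same thing happens with any function that has no return statement. A minimal sketch of the difference, with function names invented for the example:

```python
import math

def circle_area(radius):
    """Returns a value the caller can store and reuse."""
    return math.pi * radius ** 2

def show_circle_area(radius):
    """Void function: prints, but has no return statement."""
    print("Area:", circle_area(radius))

returned = circle_area(1)        # holds the computed number
nothing = show_circle_area(1)    # prints, and nothing is None

print(returned, nothing)
```
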
In [88]:
def func_1(num):
    total = 0
    total = total+num
    return total

func_1(5)
func_1(3)
Out[88]:
3
In [102]:
class Person:
    name = 'no name yet'
    age = 0.0
    
    def __init__(self, name_inp, age_inp):
        self.name = name_inp
        self.age = age_inp
    
    def increment(self):
        self.age = self.age + 1
        
    
    
s1 = Person("Ani",2)
s2 = Person("Piyush",20)

print(s2.name)
print(s2.age)

s2.increment()
print(s2.age)
Piyush
20
21
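
A compact sketch of the same idea, showing that each object carries its own data. The class is redefined here only so the snippet runs on its own; the variable names `ani` and `piyush` are illustrative.

```python
class Person:
    def __init__(self, name, age):
        self.name = name     # data stored on this particular object
        self.age = age

    def increment(self):
        self.age += 1        # changes only the object the method was called on

ani = Person("Ani", 2)
piyush = Person("Piyush", 20)

piyush.increment()
print(ani.name, ani.age)         # ani is untouched
print(piyush.name, piyush.age)
```
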

What have we done till now?

  • Operators and operands
  • Writing expressions
  • Variables, variable types, and typecasting
  • Using internal functions
    • printing to the screen
    • reading input from user
  • Function overriding
  • How to ask for help from Python? dir() and help()

Session 3

  • Exception handling
  • Conditional execution
  • Loops in Python - while and for
  • Defining functions - code encapsulation
  • Classes and Objects - code and data encapsulation
  • Strings (initial discussion)

Agenda for today

  • Strings
  • Collections in Python
    • Lists
    • Tuples
    • Sets
    • Dictionaries

Lecture recording - Python for ML, Session 4

In [258]:
from IPython.display import IFrame
IFrame("https://drive.google.com/file/d/1pg92rAdRrmrrRa3DO-Rq6dl0iDdYI4pD/preview", width="640", height="480")
Out[258]:
In [25]:
## Strings!!

b = "Hello, World!"
print(b)
b = 'Hello, World!'
print(b)

c = 'My name\ns Ani'
print(c)

# c = "My name's Ani \\\\"
# print(c)
Hello, World!
Hello, World!
My name
s Ani
In [30]:
## Strings are sequences of characters, so they can be indexed and sliced like arrays

b = "Hello, World!"
b[5]

# # Slicing strings

a = b[2:5]
print(a)

a = b[2:-2]
print(a)

a = b[1:8:3]
print(a)
llo
llo, Worl
eoW
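
Slices follow the pattern [start:stop:step], where stop is not included and any of the three parts can be left out. A few more cases on the same string, as a quick sketch (the variable names are just labels for the example):

```python
b = "Hello, World!"

first_five = b[:5]       # start defaults to 0
from_seven = b[7:]       # stop defaults to the end of the string
last_char = b[-1]        # negative indexes count from the end
every_other = b[::2]     # step of 2
reversed_b = b[::-1]     # step of -1 walks backwards

print(first_five, from_seven, last_char, every_other, reversed_b)
```
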
In [35]:
## String Functions

# len()
# strip()
# rstrip()
# lstrip()
# lower()
# upper()
# replace()
# split()

a = "Hello, World!"

# len(a)

# for i in range(len(a)):
#     char = a[i]
#     print (char)

    # print(a)
    
print(a.replace("o", "a"))

b = a.split()
b


# print(a.upper())

# #" ".join(b)
Hella, Warld!
Out[35]:
['Hello,', 'World!']
In [48]:
## String Operators + in

# string = 'Hello' + ' ' + 'World!'
# print(string)

## Checking for string contents

# txt = "The rain in Spain stays mainly in the plain"
# x = "ainy" in txt
# print(x)


string = "abc ncmm ammdm abcgmail.com ancn ancn \\n"
#print(string)

print(string[:])
print(string[::-1])

# print(string.reverse())  # AttributeError: str has no reverse(); use string[::-1]
# print("@" in string)
# #dir(str)

#dir (str)
abc ncmm ammdm abcgmail.com ancn ancn \n
n\ ncna ncna moc.liamgcba mdmma mmcn cba

Collections in Python

There are four built-in collection data types in the Python programming language:

  • List is a collection which is ordered and changeable. Allows duplicate members.
  • Tuple is a collection which is ordered and unchangeable. Allows duplicate members.
  • Set is a collection which is unordered and unindexed. No duplicate members.
  • Dictionary is a collection of key-value pairs which is changeable and indexed by key. No duplicate keys. Since Python 3.7 a dictionary keeps its items in insertion order.

When choosing a collection type, it is useful to understand the properties of that type.
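
A quick side-by-side sketch of the four types, using made-up fruit data, to make those properties concrete:

```python
fruits_list = ["apple", "banana", "apple"]     # ordered, changeable, duplicates kept
fruits_tuple = ("apple", "banana", "apple")    # ordered, unchangeable
fruits_set = {"apple", "banana", "apple"}      # duplicates collapse into one
stock = {"apple": 4, "banana": 7}              # key -> value pairs

fruits_list[0] = "guava"           # fine: lists can be changed in place

try:
    fruits_tuple[0] = "guava"      # tuples cannot be changed in place
except TypeError as err:
    tuple_error = str(err)

print(fruits_list, len(fruits_set), stock["apple"], tuple_error)
```
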

In [68]:
## Lists in Python []
# List is a collection which is ordered and changeable. Allows duplicate members.

lst1 = [1,2,3,4,5,3,3,2,6]
lst2 = ['abc','def','ghi']
lst3 = [1,'abc',2.3,True]


#print(lst1[::-1])
# lst_temp = lst1.reverse()
# print(lst1)


#All access mechanisms remain the same

# 2+2
# '2'+'2'

# lst_new = lst1*2
# print(lst_new)

# #Looping through lists

#func_add() - map()

for i in range(len(lst1)):
    lst1[i]+=2
    
print(lst1)

#Checking if item exists
[3, 4, 5, 6, 7, 5, 5, 4, 8]
In [89]:
# Manipulating lists

# # Append
# thislist = ["apple", "banana", "cherry"]
# thislist.append("orange")
# print(thislist)

# Insert
thislist = ["apple", "banana", "cherry"]
thislist.insert(1, ["orange","banana"])
print(thislist)

# # # Remove
# thislist = ["apple", "banana", "cherry","banana"]
# thislist.remove("banana")
# print(thislist)

# # Delete
# thislist = ["apple", "banana", "cherry","banana"]
# del thislist[0]
# print(thislist)
# print(thislist[0])

# # Pop
# thislist = ["apple", "banana", "cherry"]
# print(thislist.pop(1))
# print(thislist)

# # Clear
# thislist = ["apple", "banana", "cherry"]
# thislist.clear()
# print(thislist)

# # Join two lists
# list1 = ["a", "b" , "c"]
# list2 = [1, 2, 3]

# list3 = list1 + list2
# print(list3)

# list1.extend(list2)
# print(list1)
['apple', ['orange', 'banana'], 'banana', 'cherry']
In [88]:
# How are lists different from strings?

string = 'abcd'
lst = ['a','b','c','d']

# lst[1] = 'z'
# print(lst)

# new_string = string.replace('b','z')
# print(string)
# print(new_string)

# matrix_1 = [[1,2,3],[4,5,6],[7,8,9]]
# print(matrix_1)

lst = list((1,2,3))
print(lst)
[1, 2, 3]
In [92]:
#Function Parameters

var = [5]

def change_var(func_var):
    func_var[0] = 3

change_var(var)

print(var)

# what if we used a list? 

# Passing a function as a parameter
[3]
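
Whether the caller sees a change depends on what the function does with its parameter. A small sketch with two hypothetical helpers, `mutate` and `rebind`:

```python
def mutate(items):
    items[0] = 3        # changes the list object the caller passed in

def rebind(items):
    items = [3]         # only points the local name at a brand new list

numbers = [5]

rebind(numbers)
after_rebind = list(numbers)    # the caller's list is unchanged

mutate(numbers)
after_mutate = list(numbers)    # the caller's list now starts with 3

print(after_rebind, after_mutate)
```

Numbers, strings and tuples cannot be changed in place at all, so a function can never alter the caller's copy of those.
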
In [106]:
## Tuples ()
# A tuple is a collection which is ordered and unchangeable. In Python tuples are written with round brackets.

#dir(list)
#dir(tuple)

# Why use Tuples?
# Immutable

# var = (5,2)

# def change_var(func_var):
#     func_var[0] = 3

# change_var(var)

# print(var)

# Return multiple values

def func_tuple(num):
    return (num-1, num+1)

num1 = func_tuple(4)

print(num1[0],num1[1])

string1 = "abc"
string2 = "123"
string3 = "abc123"

print(string1.isalpha())

print(string2.isalpha())

print(string3.isalnum())
3 5
True
False
True
In [127]:
## Sets {}

# A set is a collection which is unordered and unindexed. It does not allow duplicate members.
# You cannot access items in a set by referring to an index: since sets are unordered, the items have no index.
# Access items using a 'for' loop or test membership with 'in'

fruitset = {'apple', 'banana', 'guava','guava'}
#fruitset = set(['apple', 'banana', 'guava','guava'])
#fruit_lst = list(fruitset)


fruitset.add('orange')


#update
fruitset.update(['papaya', 'mango'])
#print(fruitset)

# #remove - gives error if item does not exist

# fruitset.remove('papaya')
# print(fruitset)

# # discard - does not give error

# fruitset.discard('banana')
# print(fruitset)

# union, intersection, ... all set operations

fruitset1 = {'orange','banana'}

print(fruitset.intersection(fruitset1))

print(fruitset)
{'banana', 'orange'}
{'papaya', 'banana', 'orange', 'apple', 'guava', 'mango'}
In [207]:
## Dictionaries {}

# A dictionary is a changeable collection of key-value pairs, indexed by key (insertion-ordered since Python 3.7).
# In Python dictionaries are written with curly brackets, and they have keys and values.

inventory_dict = {
  "apples": 4,
  "oranges": 7,
  "guavas": 2
}
# print(inventory_dict)

# lst_dict = inventory_dict.items() #returns a view of (key, value) tuples

# # for key in inventory_dict:
# #     print (inventory_dict[key])
    
# inventory_dict["apples"] = inventory_dict["apples"]-1
# print(inventory_dict["apple"])

# print(inventory_dict.values())

inventory_dict["apple"]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-207-13848274ec0b> in <module>
     21 # print(inventory_dict.values())
     22 
---> 23 inventory_dict["apple"]

KeyError: 'apple'
In [210]:
# working with dictionaries - 

# check if key exists - 
'apples' in inventory_dict

# accessing data

print(inventory_dict.get("apple",0))

# changing values
#inventory_dict["apples"] = 10
0
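
A short sketch pulling the dictionary operations together. It uses a fresh `inventory` dictionary with the same contents as above so that it runs on its own; the extra key "mangoes" is made up.

```python
inventory = {"apples": 4, "oranges": 7, "guavas": 2}

has_apples = "apples" in inventory             # membership test on keys
missing = inventory.get("apple")               # no such key: returns None instead of raising KeyError
missing_default = inventory.get("apple", 0)    # same lookup with a fallback value

inventory["apples"] -= 1        # change an existing value
inventory["mangoes"] = 5        # assigning to a new key adds it

for fruit, count in inventory.items():
    print(fruit, count)
```
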
In [145]:
a = {'a':2, "a":4}
n = a['a']
print(a)
{'a': 4}
In [183]:
# random number - 2401

# #user input - 1307

# 1 - cow
# 1 - bull

# 1234

# 0 - cows
# 3 - bulls

1

# Generate a random number

# split random number into four digits

# user input - split that into four digits

check_list = []

for i in '2401':
    check_list.append(i in '1234')
    
check_list
Out[183]:
[True, True, False, True]
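
The notes above outline the number-guessing game: generate a secret, split it into digits, and compare it with the guess. A hedged sketch of one way to score a guess follows. The function names are invented here, and it assumes four distinct digits. The result is reported as (right place, wrong place) because which of the two gets called "cow" and which "bull" varies between versions of the game.

```python
import random

def make_secret(length=4):
    """Pick `length` different digits and join them into a string."""
    return "".join(random.sample("0123456789", length))

def score_guess(secret, guess):
    """Return (right_place, wrong_place) for two digit strings with no repeated digits."""
    right_place = sum(s == g for s, g in zip(secret, guess))
    shared = sum(g in secret for g in guess)      # digits that appear anywhere in the secret
    return right_place, shared - right_place

print(score_guess("2401", "1307"))
print(score_guess("2401", "1234"))
print(make_secret())
```
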

What have we done till now?

  • Operators and operands
  • Writing expressions
  • Variables, variable types, and typecasting
  • Using internal functions
    • printing to the screen
    • reading input from user
  • Function overriding
  • How to ask for help from Python? dir() and help()

  • Exception handling
  • Conditional execution
  • Loops in Python - while and for
  • Defining functions - code encapsulation
  • Classes and Objects - code and data encapsulation
  • Strings (initial discussion)

Session 4

  • Strings
  • Collections in Python
    • Lists
    • Tuples
    • Sets
    • Dictionaries

Agenda for today

  • Passing functions as parameters - Map, Filter
  • List comprehension
  • Collections deep dive (Live exercises)
    • strings, lists, tuples, dictionaries
  • Reading from and writing to files

Lecture recording - Python for ML, Session 5

In [257]:
from IPython.display import IFrame
IFrame("https://drive.google.com/file/d/1do06kYxwj-YygZcp0phRW8Kz0ozSBBi0/preview", width="640", height="480")
Out[257]:
In [197]:
# map function in python

def double_func(num):
    return num*2

num_lst = (1,2,3,4)
dbl_lst = []

# for i in num_lst:
#     dbl_lst.append(double_func(i))
    
dbl_lst

dbl_lst = map(double_func, num_lst)

print(type(dbl_lst))

print(list(dbl_lst))
<class 'map'>
[2, 4, 6, 8]
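
The <class 'map'> output above is a hint that map() does not build a list straight away: it returns an iterator that produces values on demand and can be consumed only once. A small sketch (the variable names are illustrative):

```python
def double_func(num):
    return num * 2

doubled = map(double_func, (1, 2, 3, 4))    # nothing is computed yet
first_pass = list(doubled)                   # values are produced here
second_pass = list(doubled)                  # empty: the iterator is already used up

# A lambda saves defining a named function for a one-off operation
doubled_lambda = list(map(lambda n: n * 2, [1, 2, 3, 4]))

print(first_pass, second_pass, doubled_lambda)
```
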
In [189]:
#List comprehension

# num_lst = (1,2,3,4) is defined in the map cell above

sq_lst_all = [num**2 for num in num_lst]

sq_lst_even = [num**2 for num in num_lst if num%2==0]
In [191]:
print(sq_lst_all)
print(sq_lst_even)
[1, 4, 9, 16]
[4, 16]
In [200]:
# filter function in python

# function that returns True for vowels, so filter() keeps them
def check_vowel(variable): 
    vowels = ['a', 'e', 'i', 'o', 'u'] 
    return variable in vowels
  
  
# sequence 
sequence = ['g', 'e', 'e', 'j', 'k', 's', 'p', 'r'] 
  
# using filter function 
filtered = filter(check_vowel, sequence) 

list(filtered)

print(type(filter))
<class 'type'>
In [195]:
sequence = ['g', 'e', 'e', 'j', 'k', 's', 'p', 'r'] 
vowels = ['a', 'e', 'i', 'o', 'u']

vowel_lst = [letter for letter in sequence if letter in vowels]
vowel_lst
Out[195]:
['e', 'e']
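
A side-by-side sketch of the two approaches. Note that type(filter) asks about the filter class itself, which is why it prints <class 'type'>; the object that filter() returns has type filter. The helper is redefined here so the snippet runs on its own.

```python
def check_vowel(letter):
    return letter in "aeiou"

sequence = ['g', 'e', 'e', 'j', 'k', 's', 'p', 'r']

filtered = filter(check_vowel, sequence)
kind = type(filtered).__name__       # name of the object's class
vowels_found = list(filtered)        # consume the iterator to see the values

# The list comprehension gives the same result in one line
vowels_comp = [letter for letter in sequence if check_vowel(letter)]

print(kind, vowels_found, vowels_comp)
```
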
In [ ]:
# Live exercise: Work with strings and lists - find the first email id in a given string
In [204]:
str1 = 'abcd ajdjd adjdj ani@gmail.com ajdjd djj ani@hotmail.com mamd'

def emailfinder_func(string):
    
    words = string.split()
    
    for word in words:
        
        if word.find('@') == -1:   #word does not contain @
            continue
        else:                      #word contains @
            return word
    


    
emailfinder_func(str1) 


help(str.find)

    
Help on method_descriptor:

find(...)
    S.find(sub[, start[, end]]) -> int
    
    Return the lowest index in S where substring sub is found,
    such that sub is contained within S[start:end].  Optional
    arguments start and end are interpreted as in slice notation.
    
    Return -1 on failure.

In [216]:
# Live exercise: Take a string input and count occurrence of each letter in the string

string = 'abcnde'
letters = list(string)
letters
Out[216]:
['a', 'b', 'c', 'n', 'd', 'e']
In [219]:
def letter_counter(string):
    
    letter_dict = dict()
    #words = string.split() #list of all words in the string
    
    for letter in string:
        # count every character except spaces, ignoring case
        if letter != ' ':
            letter_dict[letter.upper()] = letter_dict.get(letter.upper(),0)+1
    
    return letter_dict

letter_counter('abcd ajdjd adjdj ani@gmail.com ajdjd djj ani@hotmail.com mamd cat Cat doG dog Dog')
Out[219]:
{'A': 11,
 'B': 1,
 'C': 5,
 'D': 12,
 'J': 8,
 'N': 2,
 'I': 4,
 '@': 2,
 'G': 4,
 'M': 6,
 'L': 2,
 '.': 2,
 'O': 6,
 'H': 1,
 'T': 3}

In [ ]:
# Live exercise: Take a string input and count occurrence of each word in the string
In [215]:
def word_counter(string):
    
    word_dict = dict()
    words = string.split() #list of all words in the string
    
    for word in words:
        # upper-case each word so that 'cat' and 'Cat' are counted together
        word = word.upper()
        word_dict[word] = word_dict.get(word,0)+1
    
    return word_dict

word_counter('abcd ajdjd adjdj ani@gmail.com ajdjd djj ani@hotmail.com mamd cat Cat doG dog Dog')
Out[215]:
{'ABCD': 1,
 'AJDJD': 2,
 'ADJDJ': 1,
 'ANI@GMAIL.COM': 1,
 'DJJ': 1,
 'ANI@HOTMAIL.COM': 1,
 'MAMD': 1,
 'CAT': 2,
 'DOG': 3}
In [213]:
"abc's".isalpha()
Out[213]:
False

In [228]:
# Live exercise: Read the code - read from a file and list top 10 words by occurrence.

import urllib.request

target_url = 'https://raw.githubusercontent.com/anikannal/PythonForMachineLearning/master/email_dump.txt'
data = urllib.request.urlopen(target_url)
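text = str(data.read())  # str() on bytes keeps the b'...' wrapper and escaped \n; data.read().decode() would give plain text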

words = text.split()
counts = dict()
for word in words:
    counts[word] = counts.get(word,0)+1

countlist = []

for word,count in counts.items():
    countlist.append((count,word))
    
countlist.sort(reverse=True) #sorts in descending order

print (countlist[:10])
[(352, 'Jan'), (270, '2008'), (218, 'from'), (203, '4'), (194, 'with'), (142, 'Fri,'), (136, 'id'), (110, 'by'), (108, 'paploo.uhi.ac.uk'), (108, 'nakamura.uits.iupui.edu')]

In [232]:
# Live exercise: Read the code - read from a file and count emails from each email id.

import re

def count_message_from_email():
    messages = dict()

    # 'with' closes the file automatically when the block ends
    with open('./email_dump.txt') as fhand:
        for line in fhand:
            line = line.rstrip()
            if line.startswith('From'): #re.search('^From',line): #check if line starts with From
                split_line = line.split()
                messages[split_line[1]] = messages.get(split_line[1],0)+1
    return messages

count_message_from_email()
Out[232]:
{'stephen.marquard@uct.ac.za': 4,
 'louis@media.berkeley.edu': 6,
 'zqian@umich.edu': 8,
 'rjlowe@iupui.edu': 4,
 'cwen@iupui.edu': 10,
 'gsilver@umich.edu': 6,
 'wagnermr@iupui.edu': 2,
 'antranig@caret.cam.ac.uk': 2,
 'gopal.ramasammycook@gmail.com': 2,
 'david.horwitz@uct.ac.za': 8,
 'ray@media.berkeley.edu': 2}
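
One detail worth checking in the cell above: startswith('From') accepts both 'From ' lines and 'From:' header lines. If the file has one of each per message, every sender is counted twice. The sketch below shows a stricter match using the regular expression hinted at in the comment. The sample text and addresses are invented for the example and are not taken from email_dump.txt.

```python
import re

# Hypothetical sample in the style of a mailbox dump
sample = """From alice@example.com Sat Jan  5 09:14:16 2008
From: alice@example.com
Subject: hello
From bob@example.org Sat Jan  5 10:00:00 2008
From: bob@example.org
From alice@example.com Sat Jan  5 11:00:00 2008
From: alice@example.com
"""

def count_senders(text):
    counts = {}
    for line in text.splitlines():
        # '^From ' (with a trailing space) matches the first line of each message, not 'From:'
        if re.search(r'^From ', line):
            sender = line.split()[1]
            counts[sender] = counts.get(sender, 0) + 1
    return counts

sender_counts = count_senders(sample)

# Sort (sender, count) pairs by count, largest first
top = sorted(sender_counts.items(), key=lambda item: item[1], reverse=True)
print(top)
```
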
